Rule Filtering by Pattern for Efficient Hierarchical Translation

نویسندگان

  • Gonzalo Iglesias
  • Adrià de Gispert
  • Eduardo Rodríguez Banga
  • William J. Byrne
چکیده

We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on the number of non-terminals and the pattern, and various filtering strategies are then applied to assess the impact on translation speed and quality. Results are reported on the 2008 NIST Arabic-toEnglish evaluation task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hierarchical phrase-based translation with weighted finite state transducers

This dissertation is focused in the Statistical Machine Translation field (SMT), particularly in hierarchical phrase-based translation frameworks. We first study and redesign hierarchical models using several filtering techniques. Hierarchical search spaces are based on automatically extracted translation rules. As originally defined they are too big to handle directly without filtering. In thi...

متن کامل

Forced Derivations for Hierarchical Machine Translation

We present an efficient framework to estimate the rule probabilities for a hierarchical phrasebased statistical machine translation system from parallel data. In previous work, this was done with bilingual parsing. We use a more efficient approach splitting the bilingual parsing into two stages, which allows us to train a hierarchical translation model on larger tasks. Furthermore, we apply lea...

متن کامل

Hierarchical Phrase-Based Translation with Suffix Arrays

A major engineering challenge in statistical machine translation systems is the efficient representation of extremely large translation rulesets. In phrase-based models, this problem can be addressed by storing the training data in memory and using a suffix array as an efficient index to quickly lookup and extract rules on the fly. Hierarchical phrasebased translation introduces the added wrink...

متن کامل

A General-Purpose Rule Extractor for SCFG-Based Machine Translation

We present a rule extractor for SCFG-based MT that generalizes many of the contraints present in existing SCFG extraction algorithms. Our method’s increased rule coverage comes from allowing multiple alignments, virtual nodes, and multiple tree decompositions in the extraction process. At decoding time, we improve automatic metric scores by significantly increasing the number of phrase pairs th...

متن کامل

Simple and Efficient Model Filtering in Statistical Machine Translation

Data availability and distributed computing techniques have allowed statistical machine translation (SMT) researchers to build larger models. However, decoders need to be able to retrieve information efficiently from these models to be able to translate an input sentence or a set of input sentences. We introduce an easy to implement and general purpose solution to tackle this problem: we store ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009